Information Retrieval
An ensemble diversity approach to supervised binary hashing
Binary hashing is a well-known approach for fast approximate nearest-neighbor search in information retrieval. Much work has focused on affinity-based objective functions involving the hash functions or binary codes. These objective functions encode neighborhood information between data points and are often inspired by manifold learning algorithms. They ensure that the hash functions differ from each other through constraints or penalty terms that encourage codes to be orthogonal or dissimilar across bits, but this couples the binary variables and complicates the already difficult optimization. We propose a much simpler approach: we train each hash function (or bit) independently of the others, but introduce diversity among them using techniques from classifier ensembles. Surprisingly, we find that not only is this faster and trivially parallelizable, but it also improves on the more complex, coupled objective function, and achieves state-of-the-art precision and recall in experiments with image retrieval.
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.61)
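The idea of training each bit independently while obtaining diversity from ensemble techniques can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not give the objective, so the sketch uses a +1/-1 label affinity, a relaxed one-bit problem solved by the leading eigenvector, a least-squares linear hash function, and diversity from training each bit on a different random subset. The names `train_bit`, `train_hash`, and `encode` are hypothetical.

```python
import numpy as np

def train_bit(X, y, rng, subset_size):
    """Fit one linear hash function on its own random training subset."""
    idx = rng.choice(len(X), size=subset_size, replace=False)
    Xs, ys = X[idx], y[idx]
    # Assumed affinity: +1 for same-label pairs, -1 for different-label pairs.
    W = np.where(ys[:, None] == ys[None, :], 1.0, -1.0)
    # Relaxed one-bit problem: maximize z^T W z over real z, then threshold.
    # eigh returns eigenvalues in ascending order, so the last column leads.
    _, vecs = np.linalg.eigh(W)
    z = np.where(vecs[:, -1] >= 0, 1.0, -1.0)
    # Fit a linear function (with bias) that predicts the one-bit code.
    A = np.hstack([Xs, np.ones((len(Xs), 1))])
    w, *_ = np.linalg.lstsq(A, z, rcond=None)
    return w

def train_hash(X, y, n_bits, subset_size, seed=0):
    """Train n_bits hash functions; no bit depends on any other bit."""
    rng = np.random.default_rng(seed)
    return np.stack([train_bit(X, y, rng, subset_size) for _ in range(n_bits)])

def encode(X, H):
    """Map points to binary codes, one column per hash function."""
    A = np.hstack([X, np.ones((len(X), 1))])
    return (A @ H.T >= 0).astype(np.uint8)
```

Because no call to `train_bit` reads the result of another, the loop in `train_hash` could be distributed across workers, which is the sense in which such a scheme is trivially parallelizable.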
Query Complexity of Bayesian Private Learning
We study the query complexity of Bayesian Private Learning: a learner wishes to locate a random target within an interval by submitting queries, in the presence of an adversary who observes all of her queries but not the responses. How many queries are necessary and sufficient in order for the learner to accurately estimate the target, while simultaneously concealing the target from the adversary? Our main result is a query complexity lower bound that is tight up to the first order. We show that if the learner wants to estimate the target within an error of $\epsilon$, while ensuring that no adversary estimator can achieve a constant additive error with probability greater than $1/L$, then the query complexity is on the order of $L\log(1/\epsilon)$ as $\epsilon \to 0$. Our result demonstrates that increased privacy, as captured by $L$, comes at the expense of a multiplicative increase in query complexity. The proof builds on Fano's inequality and properties of certain proportional-sampling estimators.
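To give a feel for the scaling, the sketch below counts the queries used by plain, non-private bisection on the unit interval, which grows like $\log(1/\epsilon)$. It assumes each response says whether the target lies to the left of the query, which the abstract does not spell out, and it does not implement any private strategy. The function name `bisection_queries` is hypothetical.

```python
def bisection_queries(target, eps):
    """Locate target in [0, 1] to within eps; return (estimate, queries)."""
    lo, hi = 0.0, 1.0
    queries = 0
    # The midpoint of [lo, hi] is within (hi - lo) / 2 of the target.
    while (hi - lo) / 2 > eps:
        mid = (lo + hi) / 2
        queries += 1
        # Assumed response model: "is the target to the left of the query?"
        if target < mid:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2, queries

estimate, n = bisection_queries(0.3, 2 ** -10)
```

Under the stated result, hiding the target at privacy level $L$ multiplies a count of this order by roughly $L$, rather than adding a fixed overhead to it.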
Meta-Learning MCMC Proposals
Effective implementations of sampling-based probabilistic inference often require manually constructed, model-specific proposals. Inspired by recent progress in meta-learning for training learning agents that can generalize to unseen environments, we propose a meta-learning approach to building effective and generalizable MCMC proposals. We parametrize the proposal as a neural network to provide fast approximations to block Gibbs conditionals. The learned neural proposals generalize to occurrences of common structural motifs across different models, allowing for the construction of a library of learned inference primitives that can accelerate inference on unseen models with no model-specific training required. We explore several applications including open-universe Gaussian mixture models, in which our learned proposals outperform a hand-tuned sampler, and a real-world named entity recognition task, in which our sampler yields higher final F1 scores than classical single-site Gibbs sampling.
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.61)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.61)
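A proposal that approximates a Gibbs conditional still has to be corrected by a Metropolis-Hastings accept/reject step to keep the sampler exact. The sketch below shows that step with a pluggable proposal. It is not the paper's neural proposal: the callables `propose` and `log_q` are hypothetical stand-ins for a learned model, and the demonstration uses a proposal that equals the target exactly, the idealized case in which every move is accepted.

```python
import math
import random

def mh_step(x, log_target, propose, log_q, rng):
    """One Metropolis-Hastings step; log_q(a, b) is log q(a | b)."""
    x_new = propose(x, rng)
    # Log acceptance ratio: target ratio plus proposal correction.
    log_alpha = (log_target(x_new) - log_target(x)) + (
        log_q(x, x_new) - log_q(x_new, x)
    )
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return x_new, True
    return x, False

def log_std_normal(x):
    """Log density of N(0, 1)."""
    return -0.5 * x * x - 0.5 * math.log(2 * math.pi)

# Idealized proposal: draws from the target itself, ignoring the current state.
def exact_propose(x, rng):
    return rng.gauss(0.0, 1.0)

def exact_log_q(x_to, x_from):
    return log_std_normal(x_to)

rng = random.Random(0)
x, accepted = 0.0, 0
for _ in range(200):
    x, ok = mh_step(x, log_std_normal, exact_propose, exact_log_q, rng)
    accepted += ok
```

When the proposal is only an approximation of the conditional, `log_alpha` drops below zero for some moves and the acceptance rate indicates how close the approximation is.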
The Search Engine for OnlyFans Models Who Look Like Your Crush
Presearch's "Doppelgänger" is trying to help people discover adult creators rather than use nonconsensual deepfakes. For three days in February, porn star Alix Lynx flew to Miami for her first exclusive creator gathering where she was in full grind mode: shooting Reels and talking strategy with other creators. "It was kind of like SoHo House for OnlyFans girls," she says of the experience, which is called The Circle and drew more than a dozen sex workers, including Remy LaCroix and Forrest Smith. Lynx, who is a former webcam model turned OnlyFans starlet, has a combined 2 million followers across Instagram, TikTok, and X. She joined OnlyFans in 2017 with "the luxury of having my own following," she says, but those numbers haven't always translated to subscriptions. It's why she was in Miami.
- North America > United States > California (0.14)
- North America > United States > New York (0.04)
- Europe > Slovakia (0.04)
- Government (0.95)
- Information Technology > Security & Privacy (0.69)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.42)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Europe > Portugal (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.96)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.64)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Oregon (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > China (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Minnesota (0.04)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Leisure & Entertainment (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Law (0.46)
- Information Technology (0.46)
- Government (0.46)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.94)